

The  Construction  of  a  Vague  Fuzzy  Measure 
Through L1 Parameter Optimization

Evgeni  Dimitrov,  Hayden  Schaeffer,  David  Wen, 

Sandra  Rankovic,  Kizza  Nandyose,  and  Olivier  Thonnard 


Abstract — 

This  paper  presents  a  method  to  construct  an  aggregation 
function,  reflecting  a  complex  set  of  initial  user  preferences,  which 
can  be  used  in  the  framework  of  multi-criteria  decision  making. 
We  consider  problems  where  the  decision  maker  can  provide 
information  about  the  importance  and  interactions  between 
criteria,  as  well  as  a  desired  portion  of  criteria  to  be  satisfied. 
The  proposed  aggregation  process  is  a  vague  Choquet  integral, 
whose  parameters  are  constructed  in  two  steps.  First,  we  solve 
a convex constrained L1 optimization problem to obtain a fuzzy
measure  reflecting  the  importances  and  interactions  between 
the  criteria.  Then  the  measure  is  transformed  by  a  monotonic 
mapping  to  include  vague  information  on  what  portion  of  criteria 
has  to  be  satisfied.  The  proposed  approach  provides  an  automated 
construction  of  an  aggregation  function,  which  is  completely  free 
of  data  learning  and  manual  processing.  In  addition,  this  method 
provides  a  novel  fuzzy  measure  that  integrates  two  different 
classes  of  information  -  importance/interactions  of  criteria  and 
vague  statements. 

Index  Terms — Multiple  criteria  decision  aid,  Choquet  integral, 
L1 optimization, vague statements, importance and interactions

I.  Introduction 

Multi-criteria decision analysis (MCDA) is a sub-discipline
of operations research that focuses on making
decisions based on the combined information provided by
a  set  of  criteria.  It  often  involves  choosing  the  best  alternative 
from  a  given  set  of  options  by  integrating  several  expert 
opinions.  MCDA  has  found  numerous  applications  in  various 
areas  such  as  transportation,  sustainable  energy,  corporate  and 
financial  decision  making  (see  e.g.,  [45],  [9],  [49],  [50])  and 
recently cyber crime attack attribution problems [40]. A particular
method of combining information in MCDA problems
is  through  aggregation  functions.  Aggregation  functions  (or 
aggregation  operators)  can  be  applied  in  situations  where  one 
has  to  choose  an  optimal  element  from  a  given  set  of  actions 
or  objects,  which  can  be  compared  with  respect  to  different 
features.  A  typical  problem  involves  having  some  numeric 
representation  ( scores )  of  how  well  each  object  satisfies  each 
of  a  set  of  criteria.  Once  those  numbers  are  known,  one’s 
aim  is  to  combine  the  given  scores,  i.e.  aggregate  them  into 
a  global  value,  which  provides  an  overall  measure  of  how 
good  the  object  is.  The  global  scores  can  then  be  used  to 
rate  the  different  alternatives  and  consequently  decide  which 
is  optimal. 

The  combination  of  scores  is  achieved  by  an  aggregation 
function, which in all generality is simply a multi-variable
function that has a real output. Some typical examples of
aggregation  functions  are  the  arithmetic  and  weighted  mean. 
Those  are  usually  used  in  problems  where  one  can  rate 
the  importance  of  the  different  criteria  using  weights ,  which 
reflect  how  much  the  final  aggregated  value  depends  on 
each individual score (see e.g. [34], [44]). When applying
linear models like the weighted mean one assumes that different
features are independent; however, in more complex
environments they could interact (see e.g. [30], [11], [39]).
Specifically,  it  could  be  the  case  that  several  criteria  point  to 
similar  information  regarding  the  object  and  while  each  may 
be  individually  very  important,  their  combined  effect  on  the 
aggregated  value  should  not  be  much  higher  than  the  effect 
of  each  independently.  In  the  case  described  we  would  call 
the  criteria  or  features  redundant.  Conversely,  one  could  have 
that  several  criteria  capture  complementary  information  so  that 
their  overall  effect  is  much  higher  than  the  individual  effect 
of  each.  We  refer  to  the  latter  as  being  complementary  or 
synergetic. 

The  typical  example  that  appears  in  the  literature  to  illustrate 
such  interactions  is  the  student  ranking  problem  (see  [16]). 
In  this  problem  one  has  all  the  grades  in  various  subjects 
of  each  student  and  wishes  to  rank  the  latter  based  on  their 
overall  performance.  The  additional  assumption  made  is  that 
subjects  are  not  independent.  For  example,  if  a  student  is 
good  at  mathematics  he/she  is  very  likely  to  be  good  at 
physics  as  well.  Thus  the  two  grades  in  mathematics  and 
physics  point  to  similar  student  qualities  and  are  redundant. 
Conversely,  good  grades  in  mathematics  and  literature  indicate 
that a student is “well rounded” and are complementary.
Linear  models  like  the  weighted  average  are  incapable  of 
capturing  the  complex  relationships  we  just  described,  which 
necessitates  the  application  of  different  aggregation  functions. 
The  typical  aggregation  function  used  for  reflecting  criteria 
interactions  is  the  Choquet  integral  (see  [16],  [30],  [32],  [21]). 
The  formulation  of  the  Choquet  integral  makes  it  particularly 
useful  for  modeling  complex  criteria  relationships  and  it  has 
become  a  central  tool  for  addressing  problems  in  MCDA, 
finding  numerous  practical  applications  (see  the  surveys  in 
[32],  [21],  [26]). 

Another variation of the aggregation problem involves the
case when one rates objects based on the portion of criteria
they satisfy, without any preference as to which ones. For
example, a decision maker (DM) might value objects that
satisfy at least half of a set of criteria well, regardless of
how they perform on the other half. Alternatively, a DM
might wish to disregard the best and/or most poorly satisfied
criterion when evaluating an object due to some possible bias.
In  the  examples  described,  the  rating  of  an  element  only 
depends  on  the  set  of  scores  it  has  for  each  criterion  and  is 
invariant  under  permutation  of  those  scores.  This  introduction 
of  nonlinearity  renders  the  weighted  average  inapplicable  and 
several  operators  have  been  proposed  for  this  kind  of  problem 
such  as  the  ordered  weighted  average  (OWA)  and  the  weighted 
OWA  (WOWA)  [46],  [43].  Operators  in  the  OWA  family  have 
been  used  for  rating  objects  based  on  preferences  of  the  form: 
“assign  a  high  aggregated  value  if  at  least  half  of  the  scores 
are  high”.  The  example  just  given  is  often  referred  to  as  a 
vague  statement  and  one  can  obtain  different  vague  statements 
by replacing “at least half” with “most”, “some”, “all” etc.
Linguistic  quantifiers  such  as  “most”  and  “some”  are  often 
open  to  interpretation  which  is  why  statements  specified  by 
them  are  called  vague  [48].  Over  the  last  couple  of  decades 
OWA  operators  have  found  numerous  applications  in  various 
fields like neural networks ([7], [47]), geographical information
systems ([24], [35]) and group decision making under
linguistic assessments ([22], [23]). We remark that both the
OWA and WOWA are special cases of the Choquet integral;
however, the two types of aggregation operators have largely
been used in different kinds of problems (see [16], [21]).
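As a minimal sketch of how an OWA operator handles a vague statement, the Python snippet below applies its weights to the sorted scores rather than to particular criteria. The function name `owa` and the crisp weight vector standing in for “at least half” are illustrative assumptions, not parameters taken from this paper.

```python
def owa(scores, weights):
    """Ordered weighted average: weights attach to rank positions, not criteria."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    ordered = sorted(scores, reverse=True)  # largest score first
    return sum(w * s for w, s in zip(weights, ordered))


# Hypothetical crisp weights for "at least half of four scores are high":
# all weight sits on the 2nd largest score.
at_least_half = [0.0, 1.0, 0.0, 0.0]

two_high = owa([0.9, 0.8, 0.1, 0.2], at_least_half)
permuted = owa([0.1, 0.9, 0.2, 0.8], at_least_half)  # same scores, reordered
one_high = owa([0.9, 0.1, 0.1, 0.1], at_least_half)
```

Because the scores are sorted before weighting, the result does not change when the inputs are permuted, which is the invariance property described above.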

The  two  variations  of  the  aggregation  problem  presented 
above  have  been  extensively  studied  and  analyzed  in  [48],  [12], 
[14],  [18],  [33],  [16],  [21];  however,  there  is  no  one  universally 
accepted  method  for  approaching  problems  that  present  both 
types  of  preferences.  It  is  thus  the  purpose  of  this  paper  to 
develop  an  aggregation  function  that  is  capable  of  reflecting  a 
complex  set  of  user  preferences,  which  include: 

•  relative  importance  of  different  criteria 

•  interactions  between  criteria 

•  vague  statements. 

In  particular  we  will  assume  that  the  user  (decision  maker) 
can  rate  the  importance  of  different  criteria,  has  information 
about  the  way  they  interact  and  has  the  additional  preference 
of  a  vague  statement.  In  our  proposed  approach,  which  will 
be  described  in  more  detail  in  the  next  section,  we  will 
try  to  combine  some  of  the  benefits  of  the  OWA  and  the 
Choquet  integral.  We  will  take  advantage  of  the  ability  of  the 
OWA  and  the  Choquet  integral  to  model  vague  statements  and 
interactions  between  criteria  respectively  and  incorporate  those 
features  into  a  final  aggregation  function  that  reflects  all  of  the 
above  user  preferences. 

In  order  to  take  advantage  of  the  two  operators  we  first 
need  to  construct  the  sets  of  parameters  that  define  them. 
In  many  cases  the  parameters  are  designed  manually  by 
an  expert  of  the  decision  problem,  which  is  being  modeled 
(for  some  recent  examples  see  [41],  [40]).  However,  if  the 
number of criteria is N we have that the OWA is defined
by N weights, while the Choquet integral by $2^N$ parameters.
The  high  parameter  complexity  of  the  models  makes  manual 
construction  extremely  difficult  as  N  becomes  large.  Thus  for 
practical  purposes  we  design  our  method  of  construction  to  be 
automated. 

The framework that we will use for designing the
parameters of the Choquet integral is optimization. Some
applications of optimization for finding the parameters of the
Choquet integral can be found, among many others, in [14],
[33], [20], [29], [32], [25]. In most of these papers the authors
assume  that  one  has  a  set  of  rated  examples,  which  can  be  used 
as  a  basis  for  the  construction  of  a  fuzzy  measure.  In  practice 
this  may  not  be  true  as  can  be  seen  in  [41],  [40],  where  due 
to  insufficient  data  the  authors  constructed  their  parameters 
manually.  We  thus  develop  our  model  to  be  free  of  any  data 
learning.  Avoiding  data  learning  makes  our  approach  much 
more  applicable  to  areas  where  there  is  a  lack  of  sufficient  data 
or  where  data  quickly  becomes  outdated  and  unusable  for  any 
decision  making.  We  only  assume  that  the  user  can  provide 
information  on  the  preferences  listed  above  and  not  necessarily 
a  rating  of  different  examples,  although  this  modification  can 
easily  be  incorporated. 

Optimization  (also  known  as  energy  minimization)  is  used 
in  many  fields  to  solve  ill-posed  inverse  problems.  Typically, 
one wants to solve the problem $Au = f$, where $f$ is given but
$A$ is ill-conditioned. The problem is solved by minimizing the
following quantity:

$$F(u) = \|u\| + \frac{\lambda}{2}\|Au - f\|_2^2$$

where $\|\cdot\|$ is a regularizer (commonly in the form of a
norm  or  semi-norm).  In  the  fields  of  compressive  sensing  (see 
[10],  [28],  [4])  and  image  processing  (see  [27],  [36],  [3]), 
sparse  regularization  is  frequently  used  in  order  to  reconstruct 
a  vector  u,  which  has  few  non-zero  elements.  The  optimal 
regularizer is the L0-“norm”; however, a convexified version
can be found by replacing it with the L1-norm. The resulting
minimization is as follows.

$$\min_u F(u) = \min_u \|u\|_1 + \frac{\lambda}{2}\|Au - f\|_2^2 = \min_u \sum_j |u_j| + \frac{\lambda}{2}\|Au - f\|_2^2$$

Other L1-type regularizers include the total variation (TV)
semi-norm (i.e. $\|\nabla u\|$) and the nuclear norm (i.e. the sum of
the  singular  values).  The  TV  semi-norm  induces  sparsity  in 
the  jump  set  of  the  data  (see  [36])  while  the  nuclear  norm 
induces  pattern  sparsity  in  large  data  sets  (see  [5],  [37]). 

In this paper, we will use the L1 norm to regularize
the interaction parameters within the model of the Choquet
integral, which is supported by the theoretical well-posedness
of a 2-additive fuzzy measure (see Section III). The parameter
$f$ is provided by the user and represents his/her belief on what
the interactions should be, while the matrix $A$ is a binary
operation, which is 1 if an interaction is given and 0 otherwise.

The  remainder  of  the  paper  is  organized  as  follows.  In 
Section  II  we  describe  the  format  of  the  preferences  provided 
by  the  user  as  well  as  the  general  strategy  of  constructing  our 
final  aggregation  function.  In  the  third  section  we  summarize 
some of the basic facts and definitions pertaining to the Choquet
integral. Section IV develops the optimization framework
we  employ  to  construct  the  fuzzy  measure,  reflecting  criteria 
importance  and  interactions.  In  the  following  section  we 
design  the  weights  for  the  OWA,  based  on  the  vague  statement. 
Subsequently,  we  combine  the  two  sets  of  parameters  into 
a  vague  fuzzy  measure,  which  models  our  final  aggregation 
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function.  In  Section  VI  we  analyze  the  performance  of  our 
operator  on  two  examples  and  conclude  with  some  closing 
remarks  in  Section  VII. 

II.  Outline  of  Construction  Method 

As  we  mentioned  in  the  previous  section  our  goal  is  to 
construct  an  aggregation  function  that  captures  a  complex 
set  of  user  preferences  including  relative  importance  and 
interactions  of  criteria  and  vague  statements.  We  will  assume 
that  a  user  presents  this  information  in  the  following  form: 

• Importance: The relative importance of criteria is provided
via statements like “criterion 1 is twice as important
as criterion 2”.

• Interactions: The interaction between two criteria i and j
is given via a number between −1 and 1. Negative values
are  interpreted  as  i  and  j  being  redundant,  i.e.  pointing  to 
similar  information,  while  positive  values  are  interpreted 
as  the  two  criteria  being  synergetic,  i.e.  pointing  to 
complementary  information.  The  value  0  implies  that  the 
two  criteria  are  independent. 

•  Vague  statement:  A  linguistic  quantifier  is  submitted  and 
it  determines  how  many  scores  need  to  be  high  in  order 
to  maximize  the  final  aggregated  value. 

As  mentioned  in  the  introduction,  the  formulation  of  the 
Choquet  integral  makes  it  exceptionally  suitable  for  modeling 
the  importance  and  interactions  of  criteria,  while  the  OWA 
appropriately  models  vague  statements.  Thus  as  an  initial  step 
we  build  the  parameters  underlying  these  two  aggregation 
functions.  The  fuzzy  measure  for  the  Choquet  integral  is 
constructed  based  on  the  information  given  for  the  importance 
and  interactions,  while  the  weights  for  the  OWA  are  built  based 
solely  on  the  vague  statement. 

For  this  first  step  we  will  assume  a  2-additive  model  for 
the  Choquet  integral,  which  will  provide  us  with  a  sufficiently 
powerful  framework  for  representing  the  types  of  preferences 
described  above.  The  2-additive  Choquet  model  is  specified 
by  two  kinds  of  parameters  called  importance  indices  and 
interaction  indices.  As  the  names  suggest  they  measure  the 
importance  of  each  criterion  and  the  interaction  between  pairs 
of  criteria  respectively.  We  will  construct  these  parameters, 
by  minimizing  an  energy  function  that  reflects  properties  that 
we  view  as  desirable  for  an  optimal  solution.  Subsequently 
we  design  weights  for  an  OWA  operator.  We  will  use  RIM 
quantifiers  of  the  truncated  Gaussian  distribution  to  construct 
those  weights,  which  provides  a  natural  and  well  behaved 
solution. 

In  the  second  step  of  construction  we  use  both  sets  of 
parameters  designed  in  the  first  step  to  create  a  fuzzy  measure, 
which  will  be  a  basis  for  the  Choquet  integral  that  is  our  final 
aggregation  function.  For  clarity,  we  will  refer  to  the  latter  as 
a  vague  fuzzy  measure  and  a  vague  Choquet  integral  to  stress 
their  dependence  on  the  vague  statement.  The  purpose  of  the 
vague  Choquet  integral  is  to  combine  the  information  reflected 
by  the  parameters  of  the  OWA  and  the  Choquet  integral.  It 
acts  by  increasing  the  aggregated  value  of  examples  which 
both  models  render  good  and  penalizing  the  aggregated  score 
of  examples  on  which  the  two  disagree. 


The  next  section  provides  some  of  the  basic  definitions  and 
results  for  the  Choquet  integral  and  thus  may  be  skipped  by 
those  who  are  familiar  with  the  concept.  We  refer  the  reader 
to  [1],  where  one  can  find  additional  properties  of  the  discrete 
Choquet  integral  as  well  as  the  OWA  function. 

III.  Preliminary  Discussion  on  Fuzzy  Measures 

For the rest of the paper, we assume $\mathcal{N}$ to be a set of $n$
elements, $2^{\mathcal{N}}$ to be the power set of $\mathcal{N}$ and $z$ to be a vector
in $\mathbb{R}^n$, whose entries are real values between 0 and 1.

Definition 1. A capacity [8] or fuzzy measure [38] is a set
function $v : 2^{\mathcal{N}} \to [0, 1]$, satisfying the following properties:

1) $v(\emptyset) = 0$

2) $A \subseteq B$ implies $v(A) \le v(B)$

The fuzzy measure is normalized if in addition $v(\mathcal{N}) = 1$.

Definition 2. The (discrete) Choquet integral of an input
vector $z$ with respect to a fuzzy measure $v$ is given by

$$C_v(z) = \sum_{i=1}^{n} z_{(i)} \left[ v(\{j \mid z_j \ge z_{(i)}\}) - v(\{j \mid z_j \ge z_{(i+1)}\}) \right] \qquad (1)$$

where $z_{(1)} \le z_{(2)} \le \ldots \le z_{(n)}$, i.e. $z_{(i)}$ is the $i$th smallest
component of the input vector $z$. In addition, we use the
convention that $z_{(n+1)} = \infty$.

Within this paper fuzzy measures are always assumed to be
normalized. In addition, we adopt the notation $z_{\searrow}$ to mean
the vector whose components are the entries of $z$ sorted in
descending order.
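Definition 2 can be sketched in a few lines of Python. The representation of the fuzzy measure as a dictionary keyed by `frozenset`, the function name `choquet`, and the two example measures are assumptions made for illustration only.

```python
def choquet(z, v):
    """Discrete Choquet integral of scores z w.r.t. a fuzzy measure v.

    z: list of scores in [0, 1].
    v: dict mapping frozenset of criterion indices to a value in [0, 1].
    """
    order = sorted(range(len(z)), key=lambda j: z[j])  # indices, ascending score
    total = 0.0
    for pos, i in enumerate(order):
        # Criteria scoring at least z_(i). With tied scores these sets differ
        # from the definition, but the tied terms telescope to the same sum.
        upper = frozenset(order[pos:])
        upper_next = frozenset(order[pos + 1:])  # empty set at the last position
        total += z[i] * (v[upper] - v[upper_next])
    return total


# Hypothetical two-criterion measures (criteria indexed 0 and 1).
additive = {frozenset(): 0.0, frozenset({0}): 0.4,
            frozenset({1}): 0.6, frozenset({0, 1}): 1.0}
synergetic = {frozenset(): 0.0, frozenset({0}): 0.3,
              frozenset({1}): 0.3, frozenset({0, 1}): 1.0}

z = [0.5, 1.0]
weighted_mean_like = choquet(z, additive)  # matches 0.4 * 0.5 + 0.6 * 1.0
penalized = choquet(z, synergetic)         # lower: only one criterion is high
```

With an additive measure the integral reduces to a weighted mean, while the second measure rewards only joint satisfaction of both criteria.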

The formulation of the Choquet integral requires the definition
of $2^n - 2$ parameters, corresponding to the subsets of
$\mathcal{N}$ (excluding $\emptyset$ and $\mathcal{N}$, whose values are fixed to be 0 and 1
respectively). The exponential complexity of the model often
makes it difficult to construct parameters that exhibit certain
desirable properties, which necessitates the use of simpler
formulations that are easier to mold. A particular simplification
introduced in [18] is the notion of k-additive fuzzy measures.
These are families of fuzzy measures,
ranging from the additive case ($k = 1$) to the general case
($k = n$). The formal definition is as follows.

Definition 3. A fuzzy measure $v$ is said to be at most k-additive
($1 \le k \le n$) if its Möbius transform satisfies

$$\mathcal{M}_v(A) = 0$$

for any subset $A$ with more than $k$ elements, $|A| > k$. A fuzzy
measure $v$ is k-additive if in addition there exists a subset
$B \subseteq \mathcal{N}$ with $k$ elements such that $\mathcal{M}_v(B) \ne 0$, where the
Möbius transform of a fuzzy measure $v$ is a set function defined
for every $A \subseteq \mathcal{N}$ as

$$\mathcal{M}_v(A) = \sum_{B \subseteq A} (-1)^{|A \setminus B|} v(B)$$
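To make the definition concrete, the following sketch computes the Möbius transform by direct summation and reads off the additivity order. The helper names and the symmetric example measure $v(A) = (|A|/3)^2$ are hypothetical; exact `Fraction` arithmetic is used so that the comparison with zero is reliable.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def mobius(v, n):
    """Mobius transform: M(A) = sum over B subset of A of (-1)^(|A| - |B|) * v(B)."""
    return {A: sum((-1) ** (len(A) - len(B)) * v[B] for B in subsets(A))
            for A in subsets(range(n))}


def additivity_order(v, n):
    """Largest |A| with a non-zero Mobius value (k for a k-additive measure)."""
    return max((len(A) for A, m in mobius(v, n).items() if m != 0), default=0)


# Hypothetical symmetric measure on n = 3 criteria: v(A) = (|A| / 3)^2.
n = 3
v = {A: Fraction(len(A), n) ** 2 for A in subsets(range(n))}
m = mobius(v, n)
order = additivity_order(v, n)
```

For this example the Möbius value of the full set vanishes while the pairs keep a non-zero value, so the measure is 2-additive.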

In this paper we are primarily focusing on 2-additive fuzzy
measures, which extend the basic additive model by allowing
for interactions between elements of $\mathcal{N}$. In the case of a 2-additive
measure one can obtain a different representation of
the Choquet integral, which involves an easier to interpret set
of parameters called the importance and interaction indices.


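Definition 4. The importance (Shapley) index of an element $i \in \mathcal{N}$
w.r.t. a fuzzy measure $v$ is given by

$$I_i = \sum_{A \subseteq \mathcal{N} \setminus \{i\}} \frac{(n - |A| - 1)! \, |A|!}{n!} \left[ v(A \cup \{i\}) - v(A) \right]$$

The Shapley value is the vector $I_s = (I_1, \ldots, I_n)$.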
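A direct, exponential-time evaluation of the Shapley index might look as follows. This is a sketch for small $n$ only; the dictionary representation of the measure and the example values are assumptions.

```python
from itertools import combinations
from math import factorial


def shapley(v, n):
    """Importance (Shapley) indices of a fuzzy measure v on criteria 0..n-1."""
    indices = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):  # |A| ranges over 0..n-1
            coef = factorial(n - size - 1) * factorial(size) / factorial(n)
            for combo in combinations(others, size):
                A = frozenset(combo)
                total += coef * (v[A | {i}] - v[A])  # marginal gain of adding i
        indices.append(total)
    return indices


# Hypothetical normalized measure on two criteria.
v = {frozenset(): 0.0, frozenset({0}): 0.3,
     frozenset({1}): 0.5, frozenset({0, 1}): 1.0}
I = shapley(v, 2)
```

For a normalized measure the indices sum to one, so they can be read as shares of the total importance.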


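Definition 5. The interaction index between two elements $i, j \in \mathcal{N}$
w.r.t. a fuzzy measure $v$ is given by

$$I_{ij} = \sum_{A \subseteq \mathcal{N} \setminus \{i,j\}} \frac{(n - |A| - 2)! \, |A|!}{(n - 1)!} \left[ v(A \cup \{i,j\}) - v(A \cup \{i\}) - v(A \cup \{j\}) + v(A) \right]$$

An alternative representation of the interaction index between
$i$ and $j$ can be given in terms of the Möbius transform of the
fuzzy measure $v$:

$$I_{ij} = \sum_{B \mid i,j \in B} \frac{1}{|B| - 1} \mathcal{M}_v(B)$$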
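The interaction index can be evaluated in the same brute-force way as the importance index. The sketch below assumes the measure is stored as a dictionary keyed by `frozenset`; `pair_measure` is a hypothetical helper that builds small test measures.

```python
from itertools import combinations
from math import factorial


def interaction(v, n, i, j):
    """Interaction index I_ij of criteria i and j for a fuzzy measure v."""
    others = [k for k in range(n) if k not in (i, j)]
    total = 0.0
    for size in range(n - 1):  # |A| ranges over 0..n-2
        coef = factorial(n - size - 2) * factorial(size) / factorial(n - 1)
        for combo in combinations(others, size):
            A = frozenset(combo)
            total += coef * (v[A | {i, j}] - v[A | {i}] - v[A | {j}] + v[A])
    return total


def pair_measure(single_0, single_1):
    """Hypothetical normalized measure on two criteria with given singleton values."""
    return {frozenset(): 0.0, frozenset({0}): single_0,
            frozenset({1}): single_1, frozenset({0, 1}): 1.0}


synergy = interaction(pair_measure(0.3, 0.5), 2, 0, 1)     # positive value
redundancy = interaction(pair_measure(0.8, 0.7), 2, 0, 1)  # negative value
```

When the singletons together are worth less than the pair, the index is positive; when each singleton alone already carries most of the weight, it is negative.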

An importance index measures the contribution of a specific
element in all possible coalitions, i.e. the “benefit” of adding
that element to an already constructed subset of $\mathcal{N}$. Similarly,
the interaction index measures the average contribution of
adding a pair of elements to a given set, as opposed to adding
just one of the elements. If $I_{ij} > 0$ the elements $i$ and $j$
are said to be synergetic and if $I_{ij} < 0$ they are said to
be redundant. The notion of the interaction index has been
extended by Grabisch in [17], to represent interaction among
arbitrary subsets of $\mathcal{N}$. However, since we are working with
a 2-additive fuzzy measure, we only concern ourselves with
the sets of parameters $I_i$ and $I_{ij}$. In fact, it can be proved
(see for example [18], [21]) that these parameters completely
determine the fuzzy measure if it is 2-additive and one can
represent the Choquet integral alternatively in the following
form.


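$$C_v(z) = \sum_{I_{ij} > 0} (z_i \wedge z_j) I_{ij} + \sum_{I_{ij} < 0} (z_i \vee z_j) |I_{ij}| + \sum_{i \in \mathcal{N}} z_i \left( I_i - \frac{1}{2} \sum_{j \ne i} |I_{ij}| \right) \qquad (2)$$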
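The sketch below evaluates the Choquet integral directly from importance and interaction indices, assuming the standard 2-additive representation (minimum of the two scores for synergetic pairs, maximum for redundant pairs, plus a linear remainder). Storing interactions as a dictionary over unordered pairs $i < j$ is an implementation assumption.

```python
def choquet_2additive(z, I, Iij):
    """Choquet integral of z from importance indices I and interactions Iij.

    I: list of importance indices; Iij: dict {(i, j): value} over pairs i < j.
    """
    total = 0.0
    abs_sum = [0.0] * len(z)  # sum of |I_ij| over the partners j of each i
    for (i, j), value in Iij.items():
        if value > 0:    # synergetic pair: rewarded only if both scores are high
            total += min(z[i], z[j]) * value
        elif value < 0:  # redundant pair: one high score is enough
            total += max(z[i], z[j]) * -value
        abs_sum[i] += abs(value)
        abs_sum[j] += abs(value)
    total += sum(z[i] * (I[i] - 0.5 * abs_sum[i]) for i in range(len(z)))
    return total


z = [0.5, 1.0]
synergetic = choquet_2additive(z, [0.4, 0.6], {(0, 1): 0.2})
redundant = choquet_2additive(z, [0.55, 0.45], {(0, 1): -0.5})
```

The two index sets correspond to two-criterion measures with singleton values (0.3, 0.5) and (0.8, 0.7); evaluating Definition 2 on those measures gives the same results, 0.75 and 0.85.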

We  end  this  section  with  a  set  of  useful  identities  when  v  is 
2-additive,  which  are  going  to  be  used  extensively  further  in 
the  paper.  The  properties  listed  below  are  easily  verified  from 
the  given  definitions. 


$$\sum_{i=1}^{n} I_i = 1 \iff v \text{ is normalized}$$

$$I_i - \frac{1}{2} \sum_{j \ne i} |I_{ij}| \ge 0 \quad \forall i \in \mathcal{N} \iff v \text{ is monotone} \qquad (3)$$


Notice that the condition that the fuzzy measure is monotone
is related to the L1 norm of the interactions. This relationship
provides support that L1 regularization is natural for optimizing
the parameters of a Choquet integral.


IV.  Optimization  and  Fuzzy  Measures 

As  mentioned  in  the  first  two  sections  it  is  our  desire  to 
build  an  aggregation  function  for  modeling  a  wide  range  of 
preferences  of  a  DM.  Since  we  are  working  with  a  rather 
complex  set  of  preferences,  we  split  this  problem  in  two  parts. 
In  the  first  we  construct  a  2-additive  fuzzy  measure,  which 
reflects  the  relative  importance  and  interactions  among  criteria 
as  well  as  a  set  of  weights  that  capture  the  provided  vague 
statement.  In  the  second  step  we  use  both  sets  of  parameters 
and  based  on  them  construct  a  vague  fuzzy  measure  that 
captures  all  user  preferences.  This  section  describes  how  we 
obtain  the  2-additive  fuzzy  measure  as  the  unique  minimizer 
of  a  convex  optimization  problem. 

The application of optimization for designing fuzzy measures
has been studied extensively in previous works on
MCDA  [33],  [14],  [20],  [29],  [32],  [31],  [25].  Due  to  the 
complex  nature  of  the  Choquet  integral  one  often  needs 
an  automated  way  to  construct  its  parameters.  A  common 
approach  is  to  define  an  objective  function,  reflecting  some 
desired  property.  The  function  is  then  optimized  under  a  set 
of  constraints,  imposed  by  the  preferences  of  the  DM  and  by 
the  theoretical  constraints  of  the  model.  In  [33], [14],  [20]  the 
Choquet  integral  was  defined  as  the  unique  minimizer  of  the 
total  squared  error,  over  a  set  of  alternatives.  In  particular, 
the  DM  assigns  some  values  to  different  options  and  then  the 
Choquet  integral  is  determined,  by  the  fuzzy  measure,  which 
gives  the  smallest  deviation  from  those  values.  A  different 
approach, adopted in [29], [32], [31], maximizes the difference
in overall scores among alternatives. That is, given some ordinal
information  for  a  set  of  options,  one  finds  the  measure  that 
maximizes  the  overall  distance  between  these  options,  thus 
optimally  differentiating  between  them. 

The  methods  described  above  depend  strongly  on  learning 
data  provided  by  the  DM.  Since  we  desire  our  construction 
to  be  independent  of  data  learning,  we  cannot  use  either  of 
them  and  thus  approach  the  problem  differently.  Our  aim  is 
to  only  use  information  provided  by  the  DM  on  the  relative 
importance  and  interactions  among  criteria,  without  asking 
him/her  to  rate  alternatives  in  any  way.  The  objective  function 
that  we  optimize  is  then  going  to  model  the  general  structure 
of  the  measure,  subject  to  those  constraints.  In  this  context, 
our approach is similar to the one used by Kojadinovic in [25]
with two major differences. Firstly, the energy function that
we develop aims to obtain a sparse representation of the interactions
among criteria, for reasons that will be explained later
in this section. Secondly, we strongly take advantage of the 2-additive
form of the measure, and do not aim to maximize its
uncertainty as in [25], since not all available information has
been  introduced  during  the  first  step  of  construction.  Given 
our  distinct  goals,  we  propose  a  novel  objective  function  and 
formulation  of  the  problem. 


A.  The  Data 

We  assume  that  the  DM  provides  the  following  set  of 
preferences,  based  on  his/her  domain  knowledge : 

• Linear relationships between the importances of criteria

• Values for the interactions between two criteria on the
scale [−1, 1].

The  importances  of  criteria  are  connected  via  statements 
like  “Criterion  1  is  twice  as  important  as  criterion  2”.  The 
numerical value entered for the interaction between criteria i
and j is denoted by $\hat{I}_{ij}$. If no information is provided on how
i and j interact the value $\hat{I}_{ij}$ is set to zero (i.e. the criteria are
independent). A positive value implies that the DM believes
the criteria are synergetic, and a negative value that they
are redundant. We interpret the provided cardinal values as
the DM’s “initial guess” on the interaction indices in our 2-additive
fuzzy measure.

It  is  difficult  to  motivate  how  the  DM  comes  up  with 
the  above  values  for  the  interactions.  This  problem  also 
appears  in  [25]  where  the  author  includes  equations  such  as 
Iij  =  to  (where  to  is  some  numerical  constant)  within  the  set 
of  possible  constraints.  These  equations  uniquely  determine 
the  interaction  indices,  although  the  author  does  not  restrict 
himself  to  2-additive  fuzzy  measures.  In  our  case,  we  aim 
to  obtain  a  2-additive  fuzzy  measure,  which  imposes  much 
stricter  conditions  than  the  general  case.  In  order  to  overcome 
this  issue,  we  relax  the  condition  of  setting  the  interaction 
indices  exactly  as  prescribed  by  the  user.  Instead,  we  add  a 
fidelity  term  in  our  energy  function  that  minimizes  the  distance 
to  the  desired  user  values,  while  maintaining  the  2-additive 
structure of the solution. The fidelity term ensures that the
values obtained for the interaction indices are close to the
ones provided by the user, without necessarily imposing strict
equality.  The  motivation  for  this  relaxation  of  the  constraints 
lies  in  our  assumption  that  the  initial  parameter  set  does  not 
necessarily  define  a  well-posed  fuzzy  measure. 

B.  Feasible  Set 

The  next  step  in  our  problem  is  deriving  the  constraints 
in  our  optimization  problem  from  the  preferences  of  the  DM. 
Similarly  to  [30]  and  [25],  we  translate  preferences  on  criterion 
importance  into  linear  constraints  using  the  importance  indices 
of a fuzzy measure. For example $I_1 - 2I_2 = 0$ would be
the interpretation of “criterion 1 is twice as important as
criterion 2”. In general, the $k$-th preference can be written as
$\sum_{i=1}^{n} a_{ik} I_i = 0$, resulting in a linear system with $A = [a_{ik}]$
as the linear operator. The parameters $\hat{I}_{ij}$ will be used in
the definition of our objective function, and not used as
constraints.

Remark:  Our  approach  does  not  require  that  the  relationship 
between  the  importance  indices  is  given  via  linear  equalities. 
The  approach  admits  any  dependency  that  leads  to  a  convex  set 
of  feasible  values.  In  particular,  one  can  define  relationships 
via inequalities of the form $I_1 - I_2 \ge 0$, to indicate that the
first criterion is more important than the second or as in [30]
and [25] use $I_1 - I_2 \ge \delta$, with some predefined small positive
value $\delta$.

From  the  theory,  the  constructed  measure  needs  to  be 
normalized  and  monotone  in  order  to  be  well  posed.  The 
latter  conditions  were  provided  in  equation  3  at  the  end  of 
the previous section. Altogether, the (convex) set of feasible
solutions ($\Lambda$) is:

$$\Lambda = \left\{ (I_i, I_{ij}) : \sum_{i=1}^{n} a_{ik} I_i = 0 \ \forall k, \quad \sum_{i=1}^{n} I_i = 1, \quad I_i \ge \frac{1}{2} \sum_{j \ne i} |I_{ij}| \ \forall i \right\} \qquad (4)$$

In general, as long as $\mathrm{Ker}(A)$ is “large enough,” the set $\Lambda$
stays non-empty. In practice, as long as the equality relations
between the importance indices remain consistent, a solution
will exist.
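A membership test for the feasible set could be sketched as below. It assumes importance preferences are encoded one row per statement, with the helper `importance_constraint` (a hypothetical name) producing the row for “criterion i is `ratio` times as important as criterion j”; the tolerance `tol` is likewise an assumption.

```python
def importance_constraint(n, i, j, ratio):
    """Row encoding I_i - ratio * I_j = 0 over n criteria."""
    row = [0.0] * n
    row[i] = 1.0
    row[j] = -ratio
    return row


def in_feasible_set(I, Iij, rows, tol=1e-9):
    """Check linear preferences, normalization and the monotonicity condition."""
    n = len(I)
    linear_ok = all(abs(sum(a * x for a, x in zip(row, I))) <= tol for row in rows)
    normalized = abs(sum(I) - 1.0) <= tol
    abs_sum = [0.0] * n  # sum of |I_ij| over the partners j of each i
    for (i, j), value in Iij.items():
        abs_sum[i] += abs(value)
        abs_sum[j] += abs(value)
    monotone = all(I[i] - 0.5 * abs_sum[i] >= -tol for i in range(n))
    return linear_ok and normalized and monotone


# "Criterion 0 is twice as important as criterion 1" on three criteria.
rows = [importance_constraint(3, 0, 1, 2.0)]
ok = in_feasible_set([0.5, 0.25, 0.25], {(0, 1): 0.2, (1, 2): -0.1}, rows)
too_strong = in_feasible_set([0.5, 0.25, 0.25], {(1, 2): 0.6}, rows)
wrong_ratio = in_feasible_set([0.4, 0.3, 0.3], {}, rows)
```

The second call fails because an interaction of magnitude 0.6 exceeds what importances of 0.25 can support, and the third fails because the importance ratio is violated.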

C.  Objective  function  for  parameter  estimation 

We  propose  the  following  objective  (energy)  function  to  be 
minimized. 

$$E(I_i, I_{ij}) = \alpha \sum_{i} I_i \log I_i + \sum_{i,j \in \mathcal{N}} \left( |I_{ij}| + \beta (I_{ij} - \hat{I}_{ij})^2 \right), \qquad (5)$$

where $\alpha$ and $\beta$ are fixed positive constants. Both the function
and the feasible set are convex, providing a unique minimizer (when
one exists). Also note that although the energy function does
not couple $I_i$ and $I_{ij}$, the constraint that they lie in $\Lambda$ does.
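To illustrate how the three terms trade off, the snippet below evaluates an energy of this form at a candidate point. It is only an evaluator, not a solver, and it assumes that the interaction sum runs over unordered pairs and that pairs missing from either dictionary are zero; the default values of `alpha` and `beta` are arbitrary.

```python
from math import log


def energy(I, Iij, Iij_user, alpha=1.0, beta=1.0):
    """Evaluate entropy, sparsity and fidelity terms at a candidate point.

    I: importance indices; Iij: candidate interactions {(i, j): value};
    Iij_user: user-supplied interactions (missing pairs are treated as 0).
    """
    neg_entropy = sum(x * log(x) for x in I if x > 0)  # convention: 0 * log 0 = 0
    pairs = set(Iij) | set(Iij_user)
    sparsity = sum(abs(Iij.get(p, 0.0)) for p in pairs)
    fidelity = sum((Iij.get(p, 0.0) - Iij_user.get(p, 0.0)) ** 2 for p in pairs)
    return alpha * neg_entropy + sparsity + beta * fidelity


uniform = energy([1 / 3, 1 / 3, 1 / 3], {}, {})
skewed = energy([0.8, 0.1, 0.1], {}, {})
with_pair = energy([0.5, 0.5], {(0, 1): 0.2}, {(0, 1): 0.5}, alpha=1.0, beta=2.0)
```

With no interaction data, the uniform importance vector attains a lower energy than the skewed one, which matches the role the first term plays in the objective.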

The first term of equation 5 represents the (negative) entropy of the
importance indices, which is chosen to provide an appropriate
distribution for the values of $I_i$ even when little information
is provided. This term also discourages $I_i$ from being zero;
therefore every criterion is taken into account in the decision
process. The entropy term ensures all criteria are considered
nearly equal in terms of importance when no information is
provided to the contrary.

The second term in equation 5 is the L1-norm of the interaction
parameters, encouraging sparsity in the interactions $I_{ij}$. A
sparse distribution of interaction parameters is well motivated
by both theory and practice. In terms of theory, being a well-posed
fuzzy measure translates to an L1 constraint on $I_{ij}$
(in terms of $I_i$). In addition, the nature of the 2-additive
fuzzy measure implies that complex interactions (higher order
behaviors) between the criteria must be ignored for well-posedness.
In terms of practice, one assumes that only some
criteria interact, otherwise the given criteria may not be reasonable.
A low number of interacting criteria implies sparsity
of the interaction parameters. In general, the interactions of
higher (relative) magnitude are more influential in the overall
aggregation process, so that the idea of sparsity is consistent.
Thus having an L1-norm in the energy function allows us to
obtain a solution that captures the basic relationships between
criteria, without leaving the framework of 2-additivity.

As mentioned before, the parameters $I_{ij}^0$ represent the user’s
belief of what the interactions should be. Of course, it could
be the case that the input values are not feasible for a 2-additive
fuzzy measure or inconsistent in some other way.
In this regard, we include an $L^2$ fidelity term to keep our
solution close to, but not exactly equal to, the initial data set.
This norm is used in $L^2$ data fitting, for example, least squares
fitting. Minimizing this norm maintains a level of closeness
between the input and output interactions. In general, any $p$-norm
with $p > 1$ can be used; however, $L^2$ is chosen since
in the unconstrained case, the minimizers are found by soft
thresholding, and this provides the desired sparsity behavior.
Altogether, with $E$ denoting the energy function of equation 5,
the proposed optimization is as follows.

$$\begin{aligned} \underset{\{I_i\}_{i \in \mathcal{N}},\; \{I_{ij}\}_{i,j \in \mathcal{N}}}{\text{minimize}} \quad & E(\{I_i\}, \{I_{ij}\}) \\ \text{subject to} \quad & (\{I_i\}_{i \in \mathcal{N}}, \{I_{ij}\}_{i,j \in \mathcal{N}}) \in \mathcal{A} \end{aligned}$$
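
To make the three terms of the objective concrete, the following Python sketch evaluates the energy for given parameters. It is an illustration under stated assumptions rather than the authors' code: the name `energy`, the dictionary layout, the choice to sum over unordered pairs, and the three-criteria numbers are all hypothetical.

```python
from math import log


def energy(importance, interaction, prior, alpha, beta):
    """Evaluate the objective: entropy-type term plus L1 and L2 terms.

    interaction, prior -- dicts {(i, j): value} with i < j; missing pairs are 0.
    Assumption: the pair sum runs over unordered pairs i < j.
    """
    # alpha * sum_i I_i log I_i (terms with I_i = 0 contribute nothing).
    entropy_term = alpha * sum(x * log(x) for x in importance if x > 0)
    pair_term = 0.0
    for pair in set(interaction) | set(prior):
        value = interaction.get(pair, 0.0)
        target = prior.get(pair, 0.0)
        # Sparsity term |I_ij| plus fidelity term beta * (I_ij - I_ij^0)^2.
        pair_term += abs(value) + beta * (value - target) ** 2
    return entropy_term + pair_term


# Hypothetical data: the user asks for I_12 = 0.4; compare three candidates.
imp = [0.4, 0.4, 0.2]
prior = {(0, 1): 0.4}
for candidate in (0.4, 0.3, 0.0):
    print(candidate, round(energy(imp, {(0, 1): candidate}, prior, 1.0, 5.0), 4))
```

With $\beta = 5$ the candidate 0.3 has the lowest energy of the three: giving up a little fidelity is rewarded by a smaller $L^1$ term.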

The above is a convex optimization problem, which has a
unique solution and can be solved using well-developed methods,
for example, Lagrange multipliers or Bregman iterations.
In order to provide some intuition on the behavior of the
solution we consider the case when the set of linear constraints
uniquely determines the importance indices. In that case the
energy function reduces (up to an additive constant) to

$$\sum_{i,j \in \mathcal{N}} \left( |I_{ij}| + \beta (I_{ij} - I_{ij}^0)^2 \right).$$

The above function has a well-known minimizer in the unconstrained
case given by

$$I_{ij} = \mathrm{shrink}(I_{ij}^0, 1/(2\beta)),$$

where the function $\mathrm{shrink}(x, \lambda)$ for $\lambda > 0$ is given by

$$\mathrm{shrink}(x, \lambda) = \begin{cases} x - \lambda & \text{if } x > \lambda, \\ x + \lambda & \text{if } x < -\lambda, \\ 0 & \text{otherwise.} \end{cases}$$

Using the above one can easily verify that in our optimization
problem if $1/(2\beta) > |I_{ij}^0|$ then the unique minimizer
necessarily satisfies $I_{ij} = 0$. The latter indicates how in
general a sparse interaction profile is obtained. Essentially, all
interactions of small magnitude are automatically set to 0. The
shrink function plays an important role in $L^1$ regularization
problems, especially in methods such as the Split Bregman method
[13].
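
The soft-thresholding rule is short enough to state directly in code. The Python sketch below implements the shrink function as defined above and applies it with threshold $1/(2\beta)$ for $\beta = 5$; the input values are illustrative only, and `pair_cost` is a hypothetical helper used to compare against a brute-force search.

```python
def shrink(x, lam):
    """Soft thresholding: move x toward zero by lam, and return 0 if |x| <= lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def pair_cost(x, x0, beta):
    """Single-pair objective |x| + beta * (x - x0)^2 from the reduced energy."""
    return abs(x) + beta * (x - x0) ** 2


beta = 5.0
lam = 1.0 / (2.0 * beta)          # threshold 1/(2*beta) = 0.1
for x0 in (0.4, -0.15, 0.05, 0.25):
    print(x0, "->", round(shrink(x0, lam), 4))
```

A grid search over `pair_cost` gives the same minimizers, which is a quick way to convince oneself that the closed form matches the unconstrained problem.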

The exact behavior of the solution of our optimization
problem is not completely clear, although some general tendencies
such as the one above describe it to some extent.
In Section VI we look into two specific examples of the
optimization problem, which provide additional intuition about
the minimizer. In both examples we will apply our model with
$\alpha = 1$ and $\beta = 5$. Based on the above remarks, $\beta = 5$ implies
that all interactions for which $|I_{ij}^0| < 0.1$ will automatically
be set to zero, as we consider them too small. The full
understanding of the effect of the parameter $\alpha$ requires further
research. In general, higher values give precedence to well-spread
importance indices, while smaller values impose closer
adherence to the values given by the user.

V.  A  Vague  Fuzzy  Measure 

Having built our fuzzy measure reflecting criteria importance
and interactions, we now proceed with modeling the
vague statement given by the DM. The typical way vague
statements have been modeled in the literature on MCDA is by
using the ordered weighted average (OWA) and the weighted
OWA (WOWA) [48], [43], [40]. There exist well developed
methods for constructing weighing vectors for the OWA and
we choose that of RIM quantifiers, based on the truncated
Gaussian distribution [1].


The  next  step  in  our  construction  involves  combining  the 
two  sets  of  parameters  into  a  vague  fuzzy  measure,  which 
will  be  the  focus  of  this  section.  As  mentioned  earlier  there  is 
no  accepted  way  of  integrating  these  parameters  and  we  thus 
develop  a  novel  method  for  accomplishing  this  task.  Using 
the  definition  of  the  WOWA  as  motivation  we  propose  the 
following  formulation. 

Definition 6. Given a fuzzy measure $\nu$ and a weighing vector
$w$ we define a fuzzy measure $\mu$ as

$$\mu(A) = W(\nu(A)), \quad \text{for all } A \in 2^{\mathcal{N}},$$

where as usual $2^{\mathcal{N}}$ is the power set of $\mathcal{N}$, and $W$ is a
monotone non-decreasing function that interpolates the points
$(i/n, \sum_{j \le i} w_j)$ together with the point $(0, 0)$. Moreover, $W$ is
required to have the following two properties:

1) $W(i/n) = \sum_{j \le i} w_j$, $i = 0, \ldots, n$;

2) $W$ is linear if the points $(i/n, \sum_{j \le i} w_j)$ lie on a straight
line.

With the above specific composition, we propose a vague
fuzzy measure. Let us note that the obtained measure $\mu$ is
indeed a fuzzy measure, since $W$ is monotone and satisfies
$W(0) = 0$ and $W(1) = 1$. One can define many different functions
$W$ with the properties in the above definition, such as a linear spline,
a monotone quadratic spline, a monotone cubic spline, etc.
In particular, throughout this paper $W$ is piecewise linear,
interpolating the points $(i/n, \sum_{j \le i} w_j)$.
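
A piecewise-linear $W$ and the composition $W(\nu(A))$ can be sketched as follows. This is a minimal Python illustration, not the authors' code: the names `make_W` and `vague_measure` are hypothetical, and the additive measure used in the demonstration is invented purely to have something to transform.

```python
def make_W(w):
    """Piecewise-linear interpolant of the points (i/n, w_1 + ... + w_i), i = 0..n."""
    n, cum = len(w), [0.0]
    for wi in w:
        cum.append(cum[-1] + wi)

    def W(t):
        t = min(max(t, 0.0), 1.0)        # guard against rounding outside [0, 1]
        i = min(int(t * n), n - 1)       # index of the segment [i/n, (i+1)/n]
        return cum[i] + (t * n - i) * w[i]

    return W


def vague_measure(nu, w):
    """Return the set function A -> W(nu(A)) for a given measure nu."""
    W = make_W(w)
    return lambda subset: W(nu(subset))


# Hypothetical additive measure on three criteria, for illustration only.
criterion_mass = [0.2, 0.3, 0.5]
nu = lambda subset: sum(criterion_mass[i] for i in subset)
mu = vague_measure(nu, [0.1, 0.8, 0.1])
print(mu([]), mu([0]), mu([0, 1]), mu([0, 1, 2]))
```

Because `W` is non-decreasing and fixes 0 and 1, the transformed set function inherits monotonicity and normalization from `nu`.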

A.  Constructing  the  weights 

There are many choices for the weighing vector $w$. Some
typical examples seen in the literature define $w$ as the unique
solution to a different optimization problem, such as the
maximal entropy OWA and the minimal variance OWA [12].
Yet other ways involve the use of weight generating functions
[1]. In Definition 6 there is no restriction on what the weighing
vector can be, so long as it is normalized and has nonnegative
entries. Based on experiments, we favor the use of weight
generating functions, specifically those based on the truncated
Gaussian distribution.

Given a vague statement of the form “k out of n criteria
are satisfied”, we construct a Gaussian distribution with mean
$m = \frac{2k-1}{2n}$ and standard deviation $\sigma$. The weights are
explicitly given as:

Definition 7. If $w$ is a weighing vector of size $n$, we have that

$$w_i = \frac{1}{K} \int_{(i-1)/n}^{i/n} e^{-\frac{(x - m)^2}{2\sigma^2}} \, dx,$$

where $K$ is a normalization constant.

We choose $m = \frac{2k-1}{2n}$ because it lies precisely between
$\frac{k-1}{n}$ and $\frac{k}{n}$, which means that the largest entry
of $w$ is precisely the $k$-th entry. This implies that the ordered
weighted average of a vector is high precisely when it has at least $k$
high entries. We motivate the latter with a simple example.

Example: We are given the vague statement “at least 3 (out of
5) entries of z are high”. From the above definition $m = 0.5$,
and for $\sigma = 0.1$ we obtain $w = (0.001, 0.157, 0.683, 0.157, 0.001)$.
Then $w$ has a clear emphasis on the third weight, which
implies that the only way that $\mathrm{OWA}_w(z) = \langle w, z_{\searrow} \rangle$
(the inner product of $w$ with the entries of $z$ sorted in decreasing
order) is high is if the third highest entry of $z$ is high, meaning that $z$
contains at least three high entries.
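
The weight construction can be sketched with the standard library alone. The snippet below is a hedged illustration: it assumes that the weight $w_i$ is the Gaussian mass on $[(i-1)/n, i/n]$, renormalized over $[0, 1]$, and that $\sigma$ acts as the standard deviation, since this reading matches the numbers quoted in the example. The name `gaussian_weights` is hypothetical.

```python
from math import erf, sqrt


def gaussian_weights(k, n, sigma):
    """Weights for the statement 'k out of n criteria are satisfied'.

    Assumption: w_i is the Gaussian mass on [(i-1)/n, i/n], renormalized
    so that the weights sum to one (truncation to [0, 1]).
    """
    m = (2 * k - 1) / (2 * n)            # midpoint of the k-th subinterval

    def cdf(x):
        return 0.5 * (1.0 + erf((x - m) / (sigma * sqrt(2.0))))

    raw = [cdf(i / n) - cdf((i - 1) / n) for i in range(1, n + 1)]
    total = sum(raw)                     # plays the role of the constant K
    return [r / total for r in raw]


w = gaussian_weights(k=3, n=5, sigma=0.1)
print([round(v, 3) for v in w])
```

Smaller values of `sigma` concentrate more of the mass on the $k$-th entry, while larger values spread it over neighboring entries.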

B.  Properties  of  the  vague  Choquet  integral 

Based on the above formulation of the weighing vector $w$
and the measure $\nu$, constructed in the previous section, we
can now construct $\mu$ (as in the formulation of Definition 6).
Our final aggregation function is just a Choquet integral with
respect to $\mu$, which we denote by $C_{\nu,w}$ to show its relationship
to both $w$ and $\nu$.

We next examine some of the basic properties
of the newly constructed aggregation function. We
begin with a couple of examples that compare its behavior
to that of $\mathrm{OWA}_w$ and $C_\nu$ (the
Choquet integral with respect to $\nu$). For the purposes of the
following examples we will not use the optimization method of
construction for $\nu$, but rather just a simple handmade measure.

Example 2a: Suppose that we have three criteria,
$\{K_1, K_2, K_3\}$, and we are given that $K_1 \sim K_2 \sim 2K_3$.
In addition, let $K_1$ and $K_2$ have 40% synergy and let the
vague statement provided be “at least two scores are high.”
We consider two vectors $x$ and $y$, such that $x = [0.85\ 0.8\ 0.1]$
and $y = [0.3\ 0.2\ 0.8]$.

The importance and interaction indices that define $\nu$
uniquely are: $I_1 = I_2 = 0.4$, $I_3 = 0.2$, $I_{12} = 0.4$ and
$I_{23} = I_{13} = 0$. The weighing vector $w$ in OWA is $w = [0.1\ 0.8\ 0.1]$.
We calculate the aggregated value of the two vectors using the
different aggregation functions. The results are summarized in
Table I.


Vector    $\mathrm{OWA}_w$    $C_\nu$    $C_{\nu,w}$
x         0.735               0.670      0.761
y         0.340               0.340      0.256

TABLE I
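
The entries of Table I can be reproduced with a short Python sketch. It is not the authors' implementation: the helper names are hypothetical, and it assumes the standard representation of a 2-additive measure in which the singleton $\{i\}$ carries Möbius mass $I_i - \frac{1}{2}\sum_j I_{ij}$ and the pair $\{i, j\}$ carries mass $I_{ij}$.

```python
from itertools import combinations


def nu_2additive(subset, importance, interaction):
    """2-additive measure built from Shapley values I_i and interactions I_ij."""
    total = 0.0
    for i in subset:
        # Moebius mass of {i}: I_i - (1/2) * sum_j I_ij
        total += importance[i] - 0.5 * sum(
            v for pair, v in interaction.items() if i in pair)
    for pair in combinations(sorted(subset), 2):
        total += interaction.get(pair, 0.0)      # Moebius mass of {i, j}
    return total


def make_W(w):
    """Piecewise-linear interpolant of (i/n, w_1 + ... + w_i), i = 0..n."""
    n, cum = len(w), [0.0]
    for wi in w:
        cum.append(cum[-1] + wi)

    def W(t):
        i = min(int(t * n), n - 1)
        return cum[i] + (t * n - i) * w[i]

    return W


def choquet(z, measure):
    """Choquet integral: scores in decreasing order times measure increments."""
    order = sorted(range(len(z)), key=lambda i: -z[i])
    value, prev, top = 0.0, 0.0, []
    for i in order:
        top.append(i)
        cur = measure(top)
        value += z[i] * (cur - prev)
        prev = cur
    return value


def owa(z, w):
    return sum(wi * zi for wi, zi in zip(w, sorted(z, reverse=True)))


imp = [0.4, 0.4, 0.2]                 # I_1, I_2, I_3 from Example 2a
inter = {(0, 1): 0.4}                 # I_12 = 0.4, other interactions zero
w = [0.1, 0.8, 0.1]
nu = lambda A: nu_2additive(A, imp, inter)
W = make_W(w)
mu = lambda A: W(nu(A))               # vague fuzzy measure of Definition 6

for name, z in (("x", [0.85, 0.8, 0.1]), ("y", [0.3, 0.2, 0.8])):
    print(name, round(owa(z, w), 3), round(choquet(z, nu), 3), round(choquet(z, mu), 3))
```

Swapping in a different `w` or different interaction values shows how each input moves the three aggregated scores.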


As can be seen from Table I, the aggregated value for x
using $C_{\nu,w}$ is higher than those obtained from using $\mathrm{OWA}_w$
and $C_\nu$, while the one for y is lower. Both the OWA and the
Choquet integral agree that x should receive a high aggregated
score, which implies that x satisfies the constraints in both
operators well. Looking at x we can clearly see that it has
two high entries, which correspond to the two most important
criteria ($K_1$ and $K_2$), which are also synergetic. In addition, x
satisfies the vague statement well. Thus if we combine the two
kinds of information, we should obtain that x receives an overall high
score, which is precisely the result we obtained. Similarly, for
y we can see that its lowest values correspond to the most
important criteria $K_1$ and $K_2$. In addition, y does not satisfy
the given vague statement well, and thus should receive a low
score, as is the case. If a vector performs well on both kinds
of information (importance/interactions and vague statement),
then $C_{\nu,w}$ assigns it a higher value than either $\mathrm{OWA}_w$ or $C_\nu$.
And if the vector performs badly on all counts it receives an
even lower score. This is the desired behavior that we aim
for. Yet it is interesting to look into exactly how the vague
fuzzy measure redistributes the weights in OWA, according to
the information contained in the fuzzy measure $\nu$. In order to
answer this question let us look at the previous example in
more detail.

Vector    $\mathrm{OWA}_w$    $C_\nu$          $C_{\nu,w}$
x         [0.1 0.8 0.1]       [0.2 0.6 0.2]    [0.06 0.88 0.06]
y         [0.8 0.1 0.1]       [0.2 0.6 0.2]    [0.74 0.20 0.06]

TABLE II

Example 2b: As before $x = [0.85\ 0.8\ 0.1]$ and $y = [0.3\ 0.2\ 0.8]$.
Let us consider the “weights” that each entry
receives using the OWA, the Choquet integral and the vague
Choquet integral. Specifically, for the two integrals we shall
call the $k$-th “weight” the parameter that appears in front of
the $k$-th index (i.e. $\nu(\{j \mid z_j \ge z_{(k)}\}) - \nu(\{j \mid z_j \ge z_{(k+1)}\})$
in the language of Definition 2). The weights for x and y are
summarized in Table II.
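
The per-entry “weights” of a Choquet integral are simply the increments of the measure along the ordering of the scores. The following Python sketch extracts them for the measure of Example 2a and for its vague transform. It is an illustration only: `rank_weights` is a hypothetical helper, and it lists the weights in order of decreasing score, which is an assumption about the ordering convention.

```python
from itertools import combinations


def nu_2additive(subset, importance, interaction):
    """2-additive measure from Shapley values I_i and interaction indices I_ij."""
    total = 0.0
    for i in subset:
        total += importance[i] - 0.5 * sum(
            v for pair, v in interaction.items() if i in pair)
    for pair in combinations(sorted(subset), 2):
        total += interaction.get(pair, 0.0)
    return total


def make_W(w):
    """Piecewise-linear interpolant of (i/n, w_1 + ... + w_i), i = 0..n."""
    n, cum = len(w), [0.0]
    for wi in w:
        cum.append(cum[-1] + wi)

    def W(t):
        i = min(int(t * n), n - 1)
        return cum[i] + (t * n - i) * w[i]

    return W


def rank_weights(z, measure):
    """Weight multiplying the k-th largest entry of z in the Choquet integral."""
    order = sorted(range(len(z)), key=lambda i: -z[i])
    weights, prev, top = [], 0.0, []
    for i in order:
        top.append(i)
        cur = measure(top)
        weights.append(cur - prev)       # increment of the measure
        prev = cur
    return weights


imp, inter, w = [0.4, 0.4, 0.2], {(0, 1): 0.4}, [0.1, 0.8, 0.1]
nu = lambda A: nu_2additive(A, imp, inter)
W = make_W(w)
mu = lambda A: W(nu(A))

x = [0.85, 0.8, 0.1]
print([round(v, 2) for v in rank_weights(x, nu)])
print([round(v, 2) for v in rank_weights(x, mu)])
```

For x the scores are already in decreasing order, so the positions of the printed weights coincide with the criteria $K_1$, $K_2$, $K_3$.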

Looking at Table II we can at least intuitively see how
the proposed model operates. Consider the vector x. We see
that both $\mathrm{OWA}_w$ and $C_\nu$ agree that the second entry should
have a high weight, which results in a very high weight in
$C_{\nu,w}$. Similarly, as the first and last entries of x receive low
weights in both operators, those become even lower using
the vague Choquet integral. Thus when both the OWA and
Choquet integral agree that a certain entry should receive a
high weight, this translates to an even higher weight in the
vague Choquet integral, and the converse is also true.

It is worth examining how the function behaves when the
information from the OWA and the Choquet integral is not
consistent. As an example consider the vector y, where the
aggregation functions disagree on the first two entries. In this
case, the first weight in $C_{\nu,w}$ can be understood as a skewed
down version of the weight in $\mathrm{OWA}_w$. The reason for the
reduction is the low weight assigned by $C_\nu$ for the same entry.
Conversely, the second entry of y receives a higher weight in
$C_{\nu,w}$ than in $\mathrm{OWA}_w$ because its weight in $C_\nu$ is high. In
general, weights in the OWA are skewed according to those
in the Choquet integral, although the exact degree depends on
the parameters of the models.

C.  Some  Important  Cases 

The  proposed  model  has  some  additional  nice  properties, 
apart  from  the  ones  already  described.  In  particular  it  reduces 
to  some  interesting  aggregation  functions  in  specific  cases.  We 
summarize  these  properties  below. 

1) If all interactions among criteria are zero, the vague
Choquet integral becomes the $\mathrm{WOWA}_{w,I}$, where $w$ is
the weighing vector of OWA and $I$ is the Shapley value.

2) If all interactions among criteria are zero and all criteria
are equally important, the vague Choquet integral
becomes the $\mathrm{OWA}_w$.

3) If $w = [\frac{1}{n}, \ldots, \frac{1}{n}]$, the vague Choquet integral becomes
a Choquet integral with respect to the fuzzy measure $\nu$.

4) If $w = [\frac{1}{n}, \ldots, \frac{1}{n}]$ and all interactions among criteria are
zero, the vague Choquet integral becomes a weighted
mean with a weighing vector $I$.

5) If $w = [\frac{1}{n}, \ldots, \frac{1}{n}]$, all interactions among criteria are zero
and all criteria are equally important, the vague Choquet
integral becomes the arithmetic mean.

All  of  the  above  properties  can  directly  be  verified  by  the 
definition  of  the  vague  fuzzy  measure. 
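
Three of these reductions are easy to check numerically. The Python sketch below is a hedged illustration with hypothetical helper names: it reuses the three-criteria measure of Example 2a, compares the vague Choquet integral with the plain Choquet integral for a uniform weighing vector, and compares it with the OWA and the arithmetic mean for an additive measure with equal importances.

```python
from itertools import combinations


def nu_2additive(subset, importance, interaction):
    """2-additive measure from Shapley values I_i and interaction indices I_ij."""
    total = 0.0
    for i in subset:
        total += importance[i] - 0.5 * sum(
            v for pair, v in interaction.items() if i in pair)
    for pair in combinations(sorted(subset), 2):
        total += interaction.get(pair, 0.0)
    return total


def make_W(w):
    """Piecewise-linear interpolant of (i/n, w_1 + ... + w_i), i = 0..n."""
    n, cum = len(w), [0.0]
    for wi in w:
        cum.append(cum[-1] + wi)

    def W(t):
        i = min(int(t * n), n - 1)
        return cum[i] + (t * n - i) * w[i]

    return W


def choquet(z, measure):
    order = sorted(range(len(z)), key=lambda i: -z[i])
    value, prev, top = 0.0, 0.0, []
    for i in order:
        top.append(i)
        cur = measure(top)
        value += z[i] * (cur - prev)
        prev = cur
    return value


def vague_choquet(z, importance, interaction, w):
    W = make_W(w)
    return choquet(z, lambda A: W(nu_2additive(A, importance, interaction)))


def owa(z, w):
    return sum(wi * zi for wi, zi in zip(w, sorted(z, reverse=True)))


z = [0.3, 0.2, 0.8]
uniform = [1 / 3] * 3
imp, inter, w = [0.4, 0.4, 0.2], {(0, 1): 0.4}, [0.1, 0.8, 0.1]

# Case 3: uniform w leaves the measure unchanged.
case3 = (vague_choquet(z, imp, inter, uniform),
         choquet(z, lambda A: nu_2additive(A, imp, inter)))
# Case 2: no interactions, equal importances gives the OWA.
case2 = (vague_choquet(z, uniform, {}, w), owa(z, w))
# Case 5: uniform w as well gives the arithmetic mean.
case5 = (vague_choquet(z, uniform, {}, uniform), sum(z) / len(z))
print(case3, case2, case5)
```

Each printed pair should agree up to floating-point rounding.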

Remark:  The  two  steps  that  we  implemented  in  constructing 
our  vague  measure  are  completely  autonomous,  meaning  that 
each  step  can  be  applied  individually  to  different  problems, 
for  which  they  are  suited. 

VI.  Model  Evaluation 


In this section we present a practical implementation of our
proposed approach. We consider several examples of synthetic
data and describe the actions of the developed aggregation
functions. Our goal is to illustrate the contribution of each
step in our construction of the vague Choquet integral.

The proposed approach was implemented within MATLAB
[51]. The optimization problem is solved using cvx, a software
package for MATLAB (for strictly convex optimization) [2].

A. Example I

The general setup is as follows. A DM chooses from a set
of six options, which can be compared with respect to five
criteria. To each of the six alternatives we associate a vector
in $\mathbb{R}^5$, whose entries represent how well the element satisfies
each criterion. As before we refer to the entries of the vectors
as scores and all scores are assumed to be on the same scale
[0, 1]. The DM provides a set of preferences and the task is
to create an aggregation function modeling those preferences.

We denote the five criteria using the same notation as in
Section V, namely $K_1, K_2, \ldots, K_5$. Suppose that the DM has
the following set of preferences relating the importance of the
criteria.

$$K_1 \sim K_2 \sim 2K_3 \sim 3K_4 \sim 4K_5,$$

i.e. the DM believes that the first criterion is just as important
as the second one, is twice as important as the third, etc. As
discussed before we interpret the above set of preferences in
terms of a linear relationship between the importance indices
of a 2-additive fuzzy measure (e.g. $I_1 = I_2 = 2I_3$).

In addition, suppose the following set of interactions is
given.

$$I_{12}^0 = 0.4, \quad I_{13}^0 = -0.15, \quad I_{14}^0 = 0.05,$$
$$I_{23}^0 = 0.25, \quad I_{34}^0 = 0.2, \quad I_{45}^0 = -0.25.$$

Remark: The formulation of the interaction index requires
that $I_{ij} = I_{ji}$ for all pairs of criteria $i, j$. We thus use the
convention that $I_{12} = 0.4$ implies $I_{21} = 0.4$ and vice versa.
Also, as was mentioned in Section IV, all interactions for which
no information is given are set to 0.

Summarizing the above set of preferences as well as the
well-posedness conditions for the fuzzy measure, we obtain
the following optimization problem, which we solve using cvx.

$$\begin{aligned} \underset{\{I_i\},\; \{I_{ij}\}}{\text{minimize}} \quad & \alpha \sum_{i \in \mathcal{N}} I_i \log I_i + \sum_{i,j \in \mathcal{N}} \left( |I_{ij}| + \beta (I_{ij} - I_{ij}^0)^2 \right) \\ \text{subject to} \quad & (I_i, I_{ij}) \in \mathcal{A} \end{aligned}$$

The feasible set $\mathcal{A}$ is given by

$$\mathcal{A} = \left\{ \sum_{i=1}^{5} a_{ik} I_i = 0 \;\; \forall k, \quad \sum_{i=1}^{5} I_i = 1, \quad \text{and} \quad I_i \ge \frac{1}{2} \sum_{j \ne i} |I_{ij}| \right\}$$

and the values $a_{ik}$ are the entries of the matrix

$$\begin{pmatrix} 1 & -1 & 0 & 0 & 0 \\ 1 & 0 & -2 & 0 & 0 \\ 1 & 0 & 0 & -3 & 0 \\ 1 & 0 & 0 & 0 & -4 \end{pmatrix},$$

in which each row holds the coefficients of one equality constraint.

The obtained solution to the above problem using $\alpha = 1$
and $\beta = 5$, as discussed at the end of Section IV, is

$$I_1 = 0.3243, \quad I_2 = 0.3243, \quad I_3 = 0.1622, \quad I_4 = 0.1081, \quad I_5 = 0.0811 \qquad (6)$$

$$I_{12} = 0.3, \quad I_{13} = -0.05, \quad I_{23} = 0.15, \quad I_{34} = 0.0831, \quad I_{45} = -0.1331 \qquad (7)$$

In addition, as remarked earlier, we have that $I_{ij} = I_{ji}$, so that
the above equalities are valid for the corresponding interaction
indices and all other interactions are 0.

We remark that no 2-additive measure can be defined
exactly on the set of preferences given by the user. The latter
can be seen by first noting that the equality constraints for the
importance indices together with the normalization constraint
uniquely determine the solution in (6). This implies that
one of the conditions for monotonicity,
$I_3 \ge \frac{1}{2} \sum_{j \ne 3} |I_{3j}^0| \iff 0.1622 \ge 0.2250$, necessarily fails.
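
The importance indices in (6) follow from the ratio constraints and the normalization alone, which the Python sketch below reproduces with exact fractions. It then computes the monotonicity slack $I_i - \frac{1}{2}\sum_{j \ne i} |I_{ij}|$ for the reported interaction values. The helper name `slack` is hypothetical, and the interaction values are the rounded numbers from (7), so the slacks are only accurate to about four decimals.

```python
from fractions import Fraction

# I_1 = I_2 = 2 I_3 = 3 I_4 = 4 I_5 together with sum_i I_i = 1.
ratios = [1, 1, 2, 3, 4]
inverse = [Fraction(1, r) for r in ratios]
total = sum(inverse)
importance = [float(v / total) for v in inverse]
print([round(v, 4) for v in importance])

# Interaction indices reported in (7); 0-based pairs (i, j) with i < j.
interaction = {(0, 1): 0.3, (0, 2): -0.05, (1, 2): 0.15,
               (2, 3): 0.0831, (3, 4): -0.1331}


def slack(i):
    """Monotonicity slack I_i - (1/2) * sum_j |I_ij| for criterion i."""
    load = sum(abs(v) for pair, v in interaction.items() if i in pair)
    return importance[i] - 0.5 * load


print([round(slack(i), 4) for i in range(5)])
```

The slack of the fourth criterion is zero up to rounding, which suggests that its monotonicity constraint is active at the solution. That would explain why $I_{34}$ and $I_{45}$ are shrunk by more than the plain threshold 0.1.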

The next step of our construction involves designing a
weighing vector for an OWA operator capturing a vague
statement. We consider the vague statements which require
“some” and “most” of the criteria to be satisfied. Since we
are working with 5 criteria, we will interpret the two vague
statements as requiring 2 and 3 criteria, respectively, to be
satisfied. In our method for constructing the weighing vectors,
described in Section V, we set $k = 2$ and $k = 3$, respectively,
and $\sigma = 0.10$. We obtain the following weighing vectors.

$$w_1 = [0.1575\ \ 0.6836\ \ 0.1575\ \ 0.0014\ \ 0.0000] \qquad (8)$$

$$w_2 = [0.0013\ \ 0.1573\ \ 0.6827\ \ 0.1573\ \ 0.0013] \qquad (9)$$

The choice for $\sigma$ is somewhat arbitrary. In general, larger
values lead to more spread out entries in the weighing vectors,
whereas smaller values give more concentrated distributions.
In our case, we have just over two thirds of the weight
concentrated in the second and third entry of the above
vectors, respectively. The latter gives a sharp difference in the
aggregated values (using OWA) between vectors that satisfy
the vague statement poorly and those that satisfy it well.

The final step in our construction involves building the
two vague fuzzy measures (corresponding to the two vague
statements), using the 2-additive measure obtained above as well
as $w_1$ and $w_2$. We accomplish this as before by applying
Definition 6 in Section V.

We next consider the six vectors representing the alternatives
the DM is to choose from. We denote the latter by
$z_1, z_2, \ldots, z_6$. The vectors are given in Table III.

Criterion    $z_1$    $z_2$    $z_3$    $z_4$    $z_5$    $z_6$
$K_1$        0.6      0        0.8      0.1      0.1      0.5
$K_2$        0.65     0.55     0.75     0.75     0.2      0
$K_3$        0.65     0.85     0        0.75     0.9      1
$K_4$        0.8      0.9      0.1      0.75     0.9      0
$K_5$        0.15     1        0.8      0.5      0.9      1

TABLE III

We are next interested in the aggregated values that are
obtained using the 2-additive Choquet integral, the OWA
operator and the vague Choquet integral. The results are
summarized in Tables IVa and IVb. The results in Table IVa
are based on using $w_1$ for the OWA operator and the vague
Choquet integral, while those in Table IVb depend on $w_2$.

TABLE IV

(a) Results using $w_1$

Vector    $\mathrm{OWA}_{w_1}$    $C_\nu$    $C_{\nu,w_1}$
$z_1$     0.6736                  0.6402     0.6631
$z_2$     0.9074                  0.4154     0.6965
$z_3$     0.7912                  0.5771     0.7827
$z_4$     0.7497                  0.4543     0.721
$z_5$     0.8991                  0.366      0.5487
$z_6$     0.9199                  0.2929     0.475

(b) Results using $w_2$

Vector    $\mathrm{OWA}_{w_2}$    $C_\nu$    $C_{\nu,w_2}$
$z_1$     0.6417                  0.6402     0.6313
$z_2$     0.8097                  0.4154     0.3946
$z_3$     0.6547                  0.5771     0.7231
$z_4$     0.7098                  0.4543     0.525
$z_5$     0.7888                  0.366      0.224
$z_6$     0.5                     0.2929     0.0744

Firstly, we consider the action of the Choquet integral on the
given vectors. According to Table IV the highest aggregated
values belong to $z_1$ and $z_3$, while the lowest belong to $z_5$ and $z_6$.
Looking at Table III and the parameters (6) and (7) it is easy
to explain the obtained results. Indeed, $z_1$ satisfies well the
four most important criteria, which positively interact with one
another, while $z_3$ satisfies very well the two most important
criteria with the highest level of synergy. On the other hand,
$z_5$ and $z_6$ mostly satisfy the least important criteria and thus
receive a lower aggregated value.
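
The $C_\nu$ column of Table IV can be checked with a few lines of Python. The sketch below is not the authors' code: it assumes the Möbius form of the 2-additive Choquet integral, $C_\nu(z) = \sum_i m_i z_i + \sum_{i<j} I_{ij} \min(z_i, z_j)$ with $m_i = I_i - \frac{1}{2}\sum_j I_{ij}$, and it uses the rounded parameters from (6) and (7), so agreement is only expected to about three decimals.

```python
def choquet_2additive(z, importance, interaction):
    """2-additive Choquet integral in Moebius form."""
    value = 0.0
    for i, zi in enumerate(z):
        # Moebius mass of the singleton {i}.
        m_i = importance[i] - 0.5 * sum(
            v for pair, v in interaction.items() if i in pair)
        value += m_i * zi
    for (i, j), v in interaction.items():
        value += v * min(z[i], z[j])     # pair mass times the smaller score
    return value


importance = [0.3243, 0.3243, 0.1622, 0.1081, 0.0811]            # equation (6)
interaction = {(0, 1): 0.3, (0, 2): -0.05, (1, 2): 0.15,
               (2, 3): 0.0831, (3, 4): -0.1331}                   # equation (7)

# Columns of Table III: scores of each alternative on K_1..K_5.
alternatives = {
    "z1": [0.6, 0.65, 0.65, 0.8, 0.15],
    "z2": [0, 0.55, 0.85, 0.9, 1],
    "z3": [0.8, 0.75, 0, 0.1, 0.8],
    "z4": [0.1, 0.75, 0.75, 0.75, 0.5],
    "z5": [0.1, 0.2, 0.9, 0.9, 0.9],
    "z6": [0.5, 0, 1, 0, 1],
}
scores = {name: choquet_2additive(z, importance, interaction)
          for name, z in alternatives.items()}
for name, value in scores.items():
    print(name, round(value, 4))
```

The positive pair term for $K_1$ and $K_2$ is what lifts $z_1$ and $z_3$, since both score well on these two criteria simultaneously.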

The next action of interest is that of the OWA operator.
From Table IVa we notice that $z_2$, $z_5$ and $z_6$ all receive very
high aggregated values, which is precisely because they have
at least two entries $\ge 0.90$. The vector $z_1$ receives the lowest
value as it contains only one relatively high entry, while $z_3$
and $z_4$ lie in between. The picture dramatically changes once
we change the vague statement, as can be seen in Table IVb.
The highest aggregated values belong to $z_2$ and $z_5$, while
$z_6$ receives the lowest rank. The vector $z_6$ has exactly two
very high entries, which implies that it satisfies the first vague
statement well and the second very poorly, and thus the
change in its rank when applying the two OWA operators is
not only expected, but in fact desired. Our weighing vectors
were designed to differentiate sharply between alternatives that
satisfy the vague statement well and poorly, which is precisely
what we observe in Table IV.
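
The OWA columns of Table IV can be recomputed from the weight construction and the alternatives of Table III. The Python sketch below is an illustration with hypothetical function names. It assumes the weights are Gaussian masses on the subintervals $[(i-1)/n, i/n]$ with mean $(2k-1)/(2n)$ and standard deviation 0.10, renormalized to sum to one.

```python
from math import erf, sqrt


def gaussian_weights(k, n, sigma):
    """Assumed form: Gaussian mass on [(i-1)/n, i/n], renormalized on [0, 1]."""
    m = (2 * k - 1) / (2 * n)

    def cdf(x):
        return 0.5 * (1.0 + erf((x - m) / (sigma * sqrt(2.0))))

    raw = [cdf(i / n) - cdf((i - 1) / n) for i in range(1, n + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def owa(z, w):
    """Ordered weighted average: weights applied to scores sorted decreasingly."""
    return sum(wi * zi for wi, zi in zip(w, sorted(z, reverse=True)))


alternatives = {
    "z1": [0.6, 0.65, 0.65, 0.8, 0.15],
    "z2": [0, 0.55, 0.85, 0.9, 1],
    "z3": [0.8, 0.75, 0, 0.1, 0.8],
    "z4": [0.1, 0.75, 0.75, 0.75, 0.5],
    "z5": [0.1, 0.2, 0.9, 0.9, 0.9],
    "z6": [0.5, 0, 1, 0, 1],
}
w1 = gaussian_weights(2, 5, 0.10)     # "some": 2 out of 5
w2 = gaussian_weights(3, 5, 0.10)     # "most": 3 out of 5
owa1 = {name: owa(z, w1) for name, z in alternatives.items()}
owa2 = {name: owa(z, w2) for name, z in alternatives.items()}
for name in alternatives:
    print(name, round(owa1[name], 4), round(owa2[name], 4))
```

The drop for $z_6$ between the two columns comes from its third-largest score, which is only 0.5 and receives most of the weight under `w2`.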

Finally, we look into the results for the vague Choquet
integral. For now we restrict our attention to Table IVa.
According to the vague Choquet integral, the optimal choice
is $z_3$, while the worst is $z_6$. We remark that $z_3$ satisfies well
both the preferences regarding the importance and interactions
as well as the vague statement, as can be seen in Table IVa.
It has relatively high scores with respect to $\mathrm{OWA}_{w_1}$ and $C_\nu$
and this motivates its top position when applying the vague
Choquet integral. A similar argument can be made about $z_4$,
which again consistently received good scores with respect
to the first two aggregators. Conversely, $z_6$ and $z_1$, which
were the top-ranking vectors according to $\mathrm{OWA}_{w_1}$ and $C_\nu$
respectively, both receive a lower overall score, because they
fail to satisfy some portion of the preferences well.

The situation looks similar when analyzing the results in
Table IVb. We again observe that $z_3$ performs well as it
satisfies all types of preferences well. On the other hand we
notice a shift in the ranking of $z_1$, which now possesses
the second overall ranking according to the vague Choquet
integral. The latter is due to its better relative performance
on the vague statement, compared to Table IVa. The worst
performing vector is again $z_6$, although we see a dramatic
change in its aggregated value, compared to that in Table IVa.
The vector $z_6$ poorly satisfies both types of preferences, which
implies that it should receive an even lower overall score, as is the
case.

The three aggregation functions we considered lead to
different orderings of the alternatives, each of which, individually
and compared with the others, provides a reasonable
ranking of the given alternatives. The vague Choquet integral
clearly gives precedence to alternatives that perform consistently
well on both types of user preferences, while penalizing
options that fail to satisfy either. This is the general behavior
we set out to model.
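
Putting the pieces together, the vague Choquet columns of Table IV can be recomputed end to end. The Python sketch below is an illustration under the same assumptions as before (Gaussian-mass weights with standard deviation 0.10, a piecewise-linear $W$, and the standard Möbius representation of a 2-additive measure). All function names are hypothetical, and the rounded parameters of (6) and (7) limit the agreement to about three decimals.

```python
from itertools import combinations
from math import erf, sqrt


def gaussian_weights(k, n, sigma):
    m = (2 * k - 1) / (2 * n)
    cdf = lambda x: 0.5 * (1.0 + erf((x - m) / (sigma * sqrt(2.0))))
    raw = [cdf(i / n) - cdf((i - 1) / n) for i in range(1, n + 1)]
    return [r / sum(raw) for r in raw]


def nu_2additive(subset, importance, interaction):
    total = 0.0
    for i in subset:
        total += importance[i] - 0.5 * sum(
            v for pair, v in interaction.items() if i in pair)
    for pair in combinations(sorted(subset), 2):
        total += interaction.get(pair, 0.0)
    return total


def make_W(w):
    n, cum = len(w), [0.0]
    for wi in w:
        cum.append(cum[-1] + wi)
    return lambda t: cum[min(int(t * n), n - 1)] + (
        t * n - min(int(t * n), n - 1)) * w[min(int(t * n), n - 1)]


def choquet(z, measure):
    order = sorted(range(len(z)), key=lambda i: -z[i])
    value, prev, top = 0.0, 0.0, []
    for i in order:
        top.append(i)
        cur = measure(top)
        value += z[i] * (cur - prev)
        prev = cur
    return value


def vague_choquet(z, importance, interaction, w):
    W = make_W(w)
    return choquet(z, lambda A: W(nu_2additive(A, importance, interaction)))


importance = [0.3243, 0.3243, 0.1622, 0.1081, 0.0811]
interaction = {(0, 1): 0.3, (0, 2): -0.05, (1, 2): 0.15,
               (2, 3): 0.0831, (3, 4): -0.1331}
alternatives = {
    "z1": [0.6, 0.65, 0.65, 0.8, 0.15], "z2": [0, 0.55, 0.85, 0.9, 1],
    "z3": [0.8, 0.75, 0, 0.1, 0.8], "z4": [0.1, 0.75, 0.75, 0.75, 0.5],
    "z5": [0.1, 0.2, 0.9, 0.9, 0.9], "z6": [0.5, 0, 1, 0, 1],
}
results = {}
for label, k in (("w1", 2), ("w2", 3)):
    w = gaussian_weights(k, 5, 0.10)
    results[label] = {name: vague_choquet(z, importance, interaction, w)
                      for name, z in alternatives.items()}
    print(label, {name: round(v, 4) for name, v in results[label].items()})
```

Under both vague statements the highest value goes to $z_3$ and the lowest to $z_6$, in line with the discussion above.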

B.  Example  II 

In  the  previous  example  we  looked  into  how  the  different 
steps  of  the  construction  of  our  final  aggregation  function  acted 
on  a  set  of  vectors.  In  this  example  we  are  interested  in  how 
different  vague  Choquet  integrals  (based  on  different  sets  of 
preferences)  rate  the  same  set  of  alternatives.  We  can  interpret 
this  example  as  having  several  DMs  with  different  preferences, 
choosing  from  the  same  options.  We  aim  to  illustrate  how 
different  preferences  translate  into  different  choices  based  on 
our  model.  To  this  end,  we  consider  a  second  set  of  preferences 
for  the  importance  and  interactions  of  criteria. 
Suppose that a different DM has the following preferences.

$$K_4 \sim 4K_1, \quad 2K_3 \sim 3K_5.$$

In addition, suppose that he/she believes the interactions
should be:

$$I_{12}^0 = 0.2, \quad I_{23}^0 = 0.35, \quad I_{25}^0 = 0.15,$$
$$I_{34}^0 = 0.05, \quad I_{35}^0 = 0.1, \quad I_{45}^0 = -0.3.$$

Based on the above preferences we obtain the following
values for the importance and interaction indices.

$$I_1 = 0.0716, \quad I_2 = 0.217, \quad I_3 = 0.2551, \quad I_4 = 0.2862, \quad I_5 = 0.1701 \qquad (10)$$

$$I_{12} = 0.1, \quad I_{23} = 0.25, \quad I_{25} = 0.05, \quad I_{45} = -0.2 \qquad (11)$$

Let us denote by $\nu_1$ and $\nu_2$ the 2-additive measures
based on the preferences in the previous and in this example,
respectively. In addition, we will consider the same
two vague statements as before, which are modeled via
the weighing vectors $w_1$ and $w_2$ (given in (8) and (9)).
Finally, we shall consider the ordering of the alternatives
$z_1, z_2, \ldots, z_6$ induced by the four vague Choquet integrals
$C_{\nu_1,w_1}$, $C_{\nu_1,w_2}$, $C_{\nu_2,w_1}$, $C_{\nu_2,w_2}$. The aggregated values are
provided in Table Va and for clarity we present the ordering
of the vectors in Table Vb.

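TABLE V

(a) Aggregated values

Vector    $C_{\nu_1,w_1}$    $C_{\nu_1,w_2}$    $C_{\nu_2,w_1}$    $C_{\nu_2,w_2}$
$z_1$     0.6631             0.6313             0.7691             0.6712
$z_2$     0.6965             0.3946             0.915              0.7795
$z_3$     0.7827             0.7231             0.6831             0.2661
$z_4$     0.721              0.525              0.75               0.7461
$z_5$     0.5487             0.224              0.8778             0.6969
$z_6$     0.475              0.0744             0.7934             0.1477

(b) Rank of vectors (each entry is the vector occupying that rank)

Rank    $C_{\nu_1,w_1}$    $C_{\nu_1,w_2}$    $C_{\nu_2,w_1}$    $C_{\nu_2,w_2}$
1       $z_3$              $z_3$              $z_2$              $z_2$
2       $z_4$              $z_1$              $z_5$              $z_4$
3       $z_2$              $z_4$              $z_6$              $z_5$
4       $z_1$              $z_2$              $z_1$              $z_1$
5       $z_5$              $z_5$              $z_4$              $z_3$
6       $z_6$              $z_6$              $z_3$              $z_6$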

As can be seen from Table Vb, the different sets of preferences
naturally lead to different rankings of the alternatives.
When the preferences of the DM are modeled using $\nu_1$, we
see that for both vague statements $z_3$ is the optimal choice.
In particular, $z_3$ has three high entries and thus satisfies both
vague statements well. A similar situation can be seen for $z_2$,
when the 2-additive measure used is $\nu_2$.

In order to see how vague statements change the rank of
an option we only need to consider $z_6$. The vector $z_6$ has exactly two
high entries and thus it is a good representative of a vector
that satisfies only one of the vague statements well. And as
can be seen in Table Vb, when the 2-additive measure is $\nu_2$,
the ranking of $z_6$ drastically changes depending on the given
vague statement. Such a situation is not observed when $\nu_1$
is used, because as seen in Table IVa, $z_6$ fails to satisfy the
preferences modeled by $\nu_1$, and its final aggregated score is
lowest when either vague statement is used.

Finally, we remark on the effect of the 2-additive measure,
reflecting the importance and interactions. When $\nu_1$ is used $z_3$
is the best option, as it has high entries in the most important
and most synergetic two criteria. That is no longer the case
when the preferences are modeled by $\nu_2$, and as can be seen
in Table Vb the rank of $z_3$ drops to the very bottom.
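
The orderings discussed in this example can be regenerated from the reported parameters. The Python sketch below is an illustration with hypothetical names, built on the same assumptions as the earlier sketches (Gaussian-mass weights with standard deviation 0.10, piecewise-linear $W$, Möbius representation of the 2-additive measures). It evaluates the four vague Choquet integrals and sorts the alternatives from best to worst.

```python
from itertools import combinations
from math import erf, sqrt


def gaussian_weights(k, n, sigma):
    m = (2 * k - 1) / (2 * n)
    cdf = lambda x: 0.5 * (1.0 + erf((x - m) / (sigma * sqrt(2.0))))
    raw = [cdf(i / n) - cdf((i - 1) / n) for i in range(1, n + 1)]
    return [r / sum(raw) for r in raw]


def nu_2additive(subset, importance, interaction):
    total = 0.0
    for i in subset:
        total += importance[i] - 0.5 * sum(
            v for pair, v in interaction.items() if i in pair)
    for pair in combinations(sorted(subset), 2):
        total += interaction.get(pair, 0.0)
    return total


def vague_choquet(z, importance, interaction, w):
    n, cum = len(w), [0.0]
    for wi in w:
        cum.append(cum[-1] + wi)

    def W(t):                                  # piecewise-linear interpolant
        i = min(int(t * n), n - 1)
        return cum[i] + (t * n - i) * w[i]

    order = sorted(range(len(z)), key=lambda i: -z[i])
    value, prev, top = 0.0, 0.0, []
    for i in order:
        top.append(i)
        cur = W(nu_2additive(top, importance, interaction))
        value += z[i] * (cur - prev)
        prev = cur
    return value


measures = {
    "nu1": ([0.3243, 0.3243, 0.1622, 0.1081, 0.0811],
            {(0, 1): 0.3, (0, 2): -0.05, (1, 2): 0.15,
             (2, 3): 0.0831, (3, 4): -0.1331}),
    "nu2": ([0.0716, 0.217, 0.2551, 0.2862, 0.1701],
            {(0, 1): 0.1, (1, 2): 0.25, (1, 4): 0.05, (3, 4): -0.2}),
}
weights = {"w1": gaussian_weights(2, 5, 0.10), "w2": gaussian_weights(3, 5, 0.10)}
alternatives = {
    "z1": [0.6, 0.65, 0.65, 0.8, 0.15], "z2": [0, 0.55, 0.85, 0.9, 1],
    "z3": [0.8, 0.75, 0, 0.1, 0.8], "z4": [0.1, 0.75, 0.75, 0.75, 0.5],
    "z5": [0.1, 0.2, 0.9, 0.9, 0.9], "z6": [0.5, 0, 1, 0, 1],
}
values, ranking = {}, {}
for m_name, (imp, inter) in measures.items():
    for w_name, w in weights.items():
        key = (m_name, w_name)
        values[key] = {name: vague_choquet(z, imp, inter, w)
                       for name, z in alternatives.items()}
        ranking[key] = sorted(values[key], key=values[key].get, reverse=True)
        print(key, ranking[key])
```

Changing only the weighing vector, or only the measure, and rerunning the loop isolates the effect of each kind of preference on the ordering.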

VII.  Conclusion 

We presented a novel and efficient method for constructing
an aggregation function capturing two types of information:
vague statements as used in the OWA and
importance/interactions of criteria as used in the Choquet integral.
The proposed aggregation function, called a vague Choquet
integral, successfully combines two types of information that
were previously unintegrated. The proposed model is built
automatically and without the need of any data learning.
The vague fuzzy measure, underlying our final model, is
constructed by combining a 2-additive fuzzy measure and
a weighing vector capturing partial user information. The
weighing vector reflects the given vague statement and is
constructed using weight generating functions based on the
truncated Gaussian distribution. The 2-additive fuzzy measure
reflects importance and interactions of criteria and is obtained
by minimizing an $L^1$ energy function. The solution to the
optimization problem is a 2-additive fuzzy measure, whose
interaction profile is sparse. To our knowledge this paper
presents the first application of $L^1$ optimization in multi-criteria
decision-making problems, which provides an automatic
construction of a fuzzy measure.

The  proposed  model  was  shown  to  better  incorporate  a 
complex  set  of  user  preferences  than  the  OWA  or  the  Choquet 
integral  separately.  Some  future  research  includes  finding 
alternative  and  better  ways  to  interpret  user  preferences  as 
constraints  within  the  optimization  problem.  In  addition,  it 
would  be  interesting  to  consider  a  MACBETH-type  approach 
for  reflecting  the  contribution  of  each  preference  to  the  final 
aggregated  value.  This  would  lead  to  a  better  understanding 
of  the  presented  models  and  improve  their  applicability  to  real 
world  problems. 